The robustness of Text-to-SQL parsers against adversarial perturbations plays a crucial role in delivering highly reliable applications. Previous studies along this line primarily focused on perturbations on the natural language question side, neglecting the variability of tables. Motivated by this, we propose Adversarial Table Perturbation (ATP) as a new attacking paradigm to measure the robustness of Text-to-SQL models. Following this proposition, we curate ADVETA, the first robustness evaluation benchmark featuring natural and realistic ATPs. All tested state-of-the-art models experience dramatic performance drops on ADVETA, revealing models' vulnerability in real-world practice. To defend against ATP, we build a systematic adversarial training example generation framework tailored for better contextualization of tabular data. Experiments show that our approach not only brings the best robustness improvement against table-side perturbations but also substantially empowers models against NL-side perturbations. We release our benchmark and code at: https://github.com/microsoft/ContextualSP.
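As an illustration only, a table-side perturbation can be as simple as renaming schema columns to plausible synonyms and planting a near-duplicate distractor column. The synonym table and helper below are hypothetical and much cruder than ADVETA's curated perturbations:

```python
# Hypothetical synonym table; ADVETA's actual perturbations are curated.
SYNONYMS = {"name": "title", "year": "release_year"}

def perturb_table(columns, distractor="alias"):
    """Return an ATP-style perturbed copy of a schema's column list:
    rename known columns and append a semantically adjacent distractor."""
    renamed = [SYNONYMS.get(c, c) for c in columns]
    return renamed + [distractor]

schema = ["id", "name", "year"]
perturbed = perturb_table(schema)
```

A parser trained only on the original column names must now resolve "title" and ignore "alias", which is exactly the kind of table variability the benchmark probes.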
translated by Google Translate
Optical computing is an emerging technology for next-generation efficient artificial intelligence (AI) with ultra-high speed and efficiency. Electromagnetic field simulation is critical to the design, optimization, and validation of photonic devices and circuits. However, costly numerical simulation significantly hinders the scalability and turn-around time in the photonic circuit design loop. Recently, physics-informed neural networks have been proposed to predict the optical field solution of a single instance of a partial differential equation (PDE) with predefined parameters. Their complicated PDE formulation and lack of efficient parametrization mechanisms limit their flexibility and generalization in practical simulation scenarios. In this work, a physics-agnostic neural operator framework, dubbed NeurOLight, is proposed for the first time to learn a family of frequency-domain Maxwell PDEs for ultra-fast parametric photonic device simulation. We balance the efficiency and generalization of NeurOLight via several novel techniques. Specifically, we discretize different devices into a unified domain, represent parametric PDEs with a compact wave prior, and encode the incident light via masked source modeling. We design the model with parameter-efficient cross-shaped neural blocks and adopt superposition-based augmentation for data-efficient learning. With these synergistic approaches, NeurOLight generalizes to a large space of unseen simulation settings, demonstrates 2 orders of magnitude faster simulation speed than numerical solvers, and outperforms prior neural network models with 54% lower prediction error and about 44% fewer parameters. Our code is available at https://github.com/jeremiemelo/neurolight.
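The superposition-based augmentation exploits the linearity of the frequency-domain Maxwell PDE in its source term: a weighted sum of sources corresponds to the same weighted sum of field solutions, so existing (source, field) pairs can be mixed into new synthetic training pairs. A minimal sketch with toy list-based vectors and hypothetical helper names, not the NeurOLight implementation:

```python
import random

def superpose(pairs, k=2, seed=0):
    """Mix k (source, field) training pairs into one synthetic pair.
    Linearity of the PDE guarantees the mixed field still solves the
    equation for the mixed source."""
    rng = random.Random(seed)
    chosen = rng.sample(pairs, k)
    weights = [rng.uniform(0.5, 1.5) for _ in chosen]
    dim_s = len(chosen[0][0])
    dim_f = len(chosen[0][1])
    src = [sum(w * s[i] for w, (s, _) in zip(weights, chosen)) for i in range(dim_s)]
    fld = [sum(w * f[i] for w, (_, f) in zip(weights, chosen)) for i in range(dim_f)]
    return src, fld

# Toy check: if each field is exactly 2x its source, any superposition
# must preserve that linear relation.
pairs = [([1.0, 0.0], [2.0, 0.0]), ([0.0, 1.0], [0.0, 2.0])]
src, fld = superpose(pairs)
```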
Event cameras have recently gained popularity as they hold strong potential to complement regular cameras in situations of high dynamics or challenging illumination. An important problem that may benefit from event cameras is given by Simultaneous Localization And Mapping (SLAM). However, in order to ensure progress on event-inclusive multi-sensor SLAM, novel benchmark sequences are needed. Our contribution is the first set of benchmark datasets captured with a multi-sensor setup containing an event-based stereo camera, a regular stereo camera, multiple depth sensors, and an inertial measurement unit. The setup is fully hardware-synchronized and underwent accurate extrinsic calibration. All sequences come with ground truth data captured by highly accurate external reference devices such as a motion capture system. Individual sequences include both small- and large-scale environments, and cover the specific challenges targeted by dynamic vision sensors.
Harnessing the power of supervised learning (SL) for more effective reinforcement learning (RL) methods is a recent trend. We propose a novel phasic approach that alternates online RL and offline SL to tackle sparse-reward goal-conditioned problems. In the online phase, we perform RL training and collect rollout data, while in the offline phase, we perform SL on the successful trajectories from the dataset. To further improve sample efficiency, we adopt additional techniques in the online phase, including task reduction to generate more feasible trajectories and a value-based intrinsic reward to alleviate the sparse-reward problem. We call this overall algorithm Phasic self-imitative reduction (PAIR). On sparse-reward goal-conditioned robotic control problems (including a challenging stacking task), PAIR substantially outperforms non-phasic RL and phasic SL baselines. PAIR is the first RL method that learns to stack 6 cubes with only 0/1 success rewards from scratch.
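The alternation of an online collection phase and an offline imitation phase can be sketched on a toy task. Everything below (a 1-D chain with a 0/1 goal reward, a tabular behavior "clone" standing in for the SL network, random exploration standing in for the RL phase) is a hypothetical illustration, not the paper's robotic setup:

```python
import random

def rollout(policy, rng, horizon=8, goal=3):
    """One episode on a 1-D chain; a 0/1 reward is given only at the goal."""
    s, traj = 0, []
    for _ in range(horizon):
        a = policy(s, rng)
        traj.append((s, a))
        s = max(-3, min(3, s + a))
        if s == goal:
            return traj, 1
    return traj, 0

def pair_style_training(iters=300, seed=0):
    """Alternate an online rollout phase with offline SL on successes."""
    rng = random.Random(seed)
    clone = {}  # state -> imitated action: the offline-SL "policy"
    for _ in range(iters):
        # Online phase: act with the cloned policy, exploring unseen states.
        policy = lambda s, r: clone.get(s, r.choice([-1, 1]))
        traj, success = rollout(policy, rng)
        # Offline phase: supervised learning on successful trajectories only.
        if success:
            for s, a in traj:
                clone[s] = a
    return clone
```

Because only trajectories that reach the goal are imitated, the last recorded action at every visited state necessarily points toward the goal, so the cloned policy improves even though the reward is binary.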
Most conventional crowd counting methods utilize a fully-supervised learning framework to learn a mapping between scene images and crowd density maps. Under such a fully-supervised training setting, a large amount of expensive and time-consuming pixel-level annotation is required to produce density maps as supervision. One way to reduce costly labeling is to exploit the self-structural information and inherent relations among unlabeled images. Unlike previous methods that utilize these relations and structural information at the original image level, we explore such self-relations from the latent feature space, since richer relations and structural information can be extracted there. Specifically, we propose S²FPR, which can extract structural information and learn partial orders of coarse-to-fine pyramid features in the latent space for better crowd counting with massive unlabeled images. In addition, we collect a new unlabeled crowd counting dataset (FUDAN-UCC) with 4,000 images in total for training. A by-product is that our proposed S²FPR method can leverage numerous partial orders in the latent space among unlabeled images to strengthen the model representation capability and reduce the estimation errors for the crowd counting task. Extensive experiments on four benchmark datasets, i.e., UCF-QNRF, ShanghaiTech PartA and PartB, and UCF-CC-50, show the effectiveness of our method compared with previous semi-supervised methods. The source code and dataset are available at https://github.com/bridgeqiqi/s2fpr.
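The partial-order idea can be sketched as a simple constraint: a region contains at least as many people as any region nested inside it, so predicted counts listed from outermost to innermost should be non-increasing. The hinge-style helper below is an illustrative toy, not the S²FPR loss:

```python
def partial_order_loss(pred_counts, margin=0.0):
    """Hinge penalty enforcing that each region's predicted count is at
    least that of the region nested inside it. `pred_counts` is ordered
    from outermost to innermost region; violations accrue linearly."""
    loss = 0.0
    for outer, inner in zip(pred_counts, pred_counts[1:]):
        loss += max(0.0, inner - outer + margin)
    return loss
```

Because the constraint needs no ground-truth counts, it can be applied to unlabeled images, which is what makes it useful in the semi-supervised setting.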
The Hateful Memes Challenge proposed by Facebook AI has attracted contestants around the world. The challenge focuses on detecting hateful speech in multimedia memes. Various state-of-the-art deep learning models have been applied to this problem, and performance on the challenge leaderboard has kept improving. In this paper, we enhance the hateful-meme detection framework by utilizing Detectron for feature extraction, exploring different settings of the VisualBERT and UNITER models with different loss functions, investigating the association between hateful memes and sensitive text features, and finally building ensemble methods to boost model performance. The AUROCs of our fine-tuned VisualBERT, UNITER, and ensemble method reach 0.765, 0.790, and 0.803 on the challenge's test set, respectively, beating the baseline models. Our code is available at https://github.com/yatingtian/hateful-meme
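An ensemble over classifiers can be as simple as a weighted average of per-model hate probabilities. A hedged sketch; the function name and uniform weighting are assumptions, not the authors' exact combination rule:

```python
def ensemble(prob_lists, weights=None):
    """Weighted average of per-model probabilities for each example.
    `prob_lists` holds one probability list per model, aligned by meme."""
    n = len(prob_lists)
    weights = weights or [1.0 / n] * n
    return [sum(w * probs[i] for w, probs in zip(weights, prob_lists))
            for i in range(len(prob_lists[0]))]
```

Averaging calibrated probabilities from models with different failure modes (e.g., VisualBERT vs. UNITER) is a standard way to lift AUROC above any single member.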
In this paper, we study the problem of knowledge-intensive text-to-SQL, in which domain knowledge is necessary to parse expert questions into SQL queries over domain-specific tables. We formalize this scenario by building a new Chinese benchmark KnowSQL consisting of domain-specific questions covering various domains. We then address this problem by presenting formulaic knowledge, rather than by annotating additional data examples. More concretely, we construct a formulaic knowledge bank as a domain knowledge base and propose a framework (ReGrouP) to leverage this formulaic knowledge during parsing. Experiments using ReGrouP demonstrate a significant 28.2% improvement overall on KnowSQL.
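A toy sketch of how a formulaic knowledge bank might be consulted during parsing: a domain term in the question is matched against the bank and expanded into a column-level expression. The bank entries, table name, and helper are hypothetical and far simpler than ReGrouP:

```python
# Hypothetical mini "formulaic knowledge bank": domain term -> formula
# over table columns. Real KnowSQL formulas are much richer.
FORMULA_BANK = {
    "gross margin": "(revenue - cost) / revenue",
    "net profit": "revenue - cost - tax",
}

def ground_question(question, table):
    """Rewrite a question into SQL by substituting a known formulaic term;
    return None when no bank entry applies."""
    for term, formula in FORMULA_BANK.items():
        if term in question.lower():
            return f"SELECT {formula} FROM {table}"
    return None
```

The point of the design is that one bank entry covers every question mentioning the term, instead of annotating additional (question, SQL) pairs per domain.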
Weakly-supervised object localization aims to indicate the category as well as the scope of an object in an image given only the image-level labels. Most of the existing works are based on Class Activation Mapping (CAM) and endeavor to enlarge the discriminative area inside the activation map to perceive the whole object, yet ignore the co-occurrence confounder of the object and context (e.g., fish and water), which makes it hard for the model to distinguish object boundaries. Besides, the use of CAM also brings a dilemma: classification and localization always suffer from a performance gap and cannot reach their highest accuracy simultaneously. In this paper, we propose a causal knowledge distillation method, dubbed KD-CI-CAM, to address these two under-explored issues in one go. More specifically, we tackle the co-occurrence context confounder problem via causal intervention (CI), which explores the causalities among image features, contexts, and categories to eliminate the biased object-context entanglement in the class activation maps. Based on the de-biased object feature, we additionally propose a multi-teacher causal distillation framework to balance the absorption of classification knowledge and localization knowledge during model training. Extensive experiments on several benchmarks demonstrate the effectiveness of KD-CI-CAM in learning clear object boundaries from confounding contexts and addressing the dilemma problem between classification and localization performance.
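Causal intervention of this kind is commonly realized as backdoor adjustment: the class score is averaged over context strata weighted by their priors, rather than conditioned on the image's own co-occurring context. A schematic sketch in which the score function and strata are toy assumptions, not KD-CI-CAM's implementation:

```python
def backdoor_adjusted_score(feature, score_fn, contexts, priors):
    """Backdoor adjustment: sum_c P(c) * score(feature, c).
    Averaging over all context strata prevents one frequently
    co-occurring context (e.g., water with fish) from dominating."""
    return sum(p * score_fn(feature, c) for c, p in zip(contexts, priors))
```

With a toy score `f * (1 + c)` and two equally likely strata, the adjusted score is the stratum average rather than the score under whichever context the image happened to contain.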
Dynamic treatment regimes assign personalized treatments to patients sequentially over time based on their baseline information and time-varying covariates. In mobile health applications, these covariates are typically collected at different frequencies over a long time horizon. In this paper, we propose a deep spectral Q-learning algorithm, which integrates principal component analysis (PCA) with deep Q-learning to handle the mixed frequency data. In theory, we prove that the mean return under the estimated optimal policy converges to that under the optimal one and establish its rate of convergence. The usefulness of our proposal is further illustrated via simulations and an application to a diabetes dataset.
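A toy sketch of the dimension-reduction step, assuming the goal is to compress a high-frequency covariate vector to a low-dimensional summary before it enters the Q-function. Pure-Python power iteration stands in for a real PCA routine; it is not the paper's estimator:

```python
def top_component(X, iters=100):
    """Leading principal direction of the centered data X (list of rows),
    found by power iteration on the covariance matrix."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mean[j] for j in range(d)] for row in X]
    v = [1.0] * d
    for _ in range(iters):
        # One power-iteration step: v <- (X^T X) v, then normalize.
        w = [sum(Xc[i][j] * sum(Xc[i][k] * v[k] for k in range(d))
                 for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v, mean

def compress(row, v, mean):
    """Project one high-frequency covariate vector onto the component."""
    return sum((r - m) * vj for r, m, vj in zip(row, mean, v))
```

The scalar summary from `compress` would then be concatenated with the low-frequency covariates as the state fed to the Q-network.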
Nowadays, time-stamped web documents related to a general news query flood the Internet, and timeline summarization targets concisely summarizing the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the feature of event-level attention in its generation process, with sequential information retained, and use it to simulate the evolutionary attention of the ground-truth summary. The event-level attention can also be used to assist in extracting the summary, and the extracted summary likewise comes in time order. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.